Feature crossing

2024-10-21

A synthetic feature formed by “crossing” categorical or bucketed features. Formally, a cross is the Cartesian product of the value sets of the features being crossed.¹

For example, consider a “mood forecasting” model that represents temperature in one of the following four buckets:

freezing
chilly
temperate
warm

And represents wind speed in one of the following three buckets:

still
light
windy

Without feature crosses, a linear model trains independently on each of the preceding seven buckets. For example, the model trains on freezing independently of how it trains on windy. Alternatively, you could create a feature cross of temperature and wind speed. This synthetic feature would have the following 12 possible values (4 temperature buckets × 3 wind speed buckets):¹

freezing-still
freezing-light
freezing-windy
chilly-still
chilly-light
chilly-windy
temperate-still
temperate-light
temperate-windy
warm-still
warm-light
warm-windy
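
The 12 values above are the Cartesian product of the two bucket lists. A minimal Python sketch, assuming the hyphen-joined naming used in the list; the names cross_values and cross_one_hot are illustrative, not from any library:

```python
from itertools import product

TEMPERATURE_BUCKETS = ["freezing", "chilly", "temperate", "warm"]
WIND_BUCKETS = ["still", "light", "windy"]


def cross_values(left, right, sep="-"):
    """Return every pairing of a left bucket with a right bucket."""
    return [f"{a}{sep}{b}" for a, b in product(left, right)]


# Vocabulary of the crossed feature: 4 x 3 = 12 entries.
CROSSED = cross_values(TEMPERATURE_BUCKETS, WIND_BUCKETS)


def cross_one_hot(temperature, wind):
    """One-hot encode one (temperature, wind) pair over the crossed vocabulary."""
    value = f"{temperature}-{wind}"
    return [1 if v == value else 0 for v in CROSSED]
```

Each example then activates exactly one of the 12 crossed positions, so a linear model can attach a separate weight to each combination.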

Thanks to feature crosses, the model can learn mood differences between a freezing-windy day and a freezing-still day.
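
One way to see this is to compare a linear model with one weight per bucket against a model that also has one weight per crossed value. The sketch below uses made-up weights chosen only for illustration (they are not learned values, and the function names are hypothetical). With per-bucket weights alone, the effect of windy versus still is the same at every temperature; crossed weights let it differ.

```python
# Hypothetical weights, chosen only to illustrate the point.
temp_w = {"freezing": -2.0, "chilly": -1.0, "temperate": 1.0, "warm": 2.0}
wind_w = {"still": 0.5, "light": 0.0, "windy": -0.5}


def additive_score(temp, wind):
    """Linear model without crosses: one weight per bucket, summed."""
    return temp_w[temp] + wind_w[wind]


# Hypothetical weights for crossed values; unlisted combinations default to 0.
cross_w = {("freezing", "windy"): -3.0, ("warm", "windy"): 1.0}


def crossed_score(temp, wind):
    """Linear model that also has one weight per crossed value."""
    return additive_score(temp, wind) + cross_w.get((temp, wind), 0.0)


def wind_effect(score, temp):
    """Score difference between a windy day and a still day at one temperature."""
    return score(temp, "windy") - score(temp, "still")


additive_effects = {t: wind_effect(additive_score, t) for t in temp_w}
crossed_effects = {t: wind_effect(crossed_score, t) for t in temp_w}

print(additive_effects)  # identical value for every temperature
print(crossed_effects)   # value depends on the temperature
```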

Feature crosses are mostly used with linear models and are rarely used with neural networks.

Footnotes

¹ developers.google.com/machine-learning/glossary#feature_cross